De-anonymizing Web Browsing Data with Social Networks
Abstract
Can online trackers and network adversaries de-anonymize web browsing data readily available to them? We show, theoretically, via simulation, and through experiments on real user data, that de-identified web browsing histories can be linked to social media profiles using only publicly available data. Our approach is based on a simple observation: each person has a distinctive social network, and thus the set of links appearing in one's feed is unique. Assuming users visit links in their feed with higher probability than a random user, browsing histories contain tell-tale marks of identity. We formalize this intuition by specifying a model of web browsing behavior and then deriving the maximum likelihood estimate of a user's social profile. We evaluate this strategy on simulated browsing histories, and show that given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50% of the time. To gauge the real-world effectiveness of this approach, we recruited nearly 400 people to donate their web browsing histories, and we were able to correctly identify more than 70% of them. We further show that several online trackers are embedded on sufficiently many websites to carry out this attack with high accuracy. Our theoretical contribution applies to any type of transactional data and is robust to noisy observations, generalizing a wide range of previous de-anonymization attacks. Finally, since our attack attempts to find the correct Twitter profile out of over 300 million candidates, it is, to our knowledge, the largest-scale demonstrated de-anonymization to date.
CCS Concepts • Security and privacy → Pseudonymity, anonymity and untraceability; • Information systems → Online advertising; Social networks
© 2017 ACM.
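The maximum likelihood idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's estimator: it assumes a deliberately simplified behavior model in which a user visits a link in proportion to its global popularity, multiplied by a hypothetical factor `boost` when the link appears in that user's feed. The function names, the `boost` parameter, and the sample data are all illustrative assumptions.

```python
import math


def log_likelihood(history, feed, popularity, boost=10.0):
    """Log-likelihood of a browsing history for one candidate profile.

    Toy model (an assumption, not the paper's exact model): a link is
    visited with probability proportional to its global popularity,
    multiplied by `boost` if the link is in the candidate's feed.
    `popularity` must sum to 1 and contain every link in `history`.
    """
    # Normalizing constant so the boosted probabilities sum to 1.
    feed_mass = sum(popularity[link] for link in feed if link in popularity)
    z = 1.0 + (boost - 1.0) * feed_mass
    total = 0.0
    for link in history:
        weight = boost if link in feed else 1.0
        total += math.log(weight * popularity[link] / z)
    return total


def most_likely_profile(history, feeds, popularity, boost=10.0):
    """Return the candidate whose feed best explains the history."""
    return max(
        feeds,
        key=lambda user: log_likelihood(history, feeds[user], popularity, boost),
    )


# Hypothetical data: six equally popular links and three candidate feeds.
popularity = {link: 1.0 / 6.0 for link in "abcdef"}
feeds = {
    "alice": {"a", "b"},
    "bob": {"c", "d"},
    "carol": {"a", "b", "c", "d", "e"},
}
history = ["a", "b", "f"]  # two links from alice's feed plus one noisy visit

best = most_likely_profile(history, feeds, popularity)
print(best)
```

The normalizing constant penalizes candidates with very large feeds: `carol` also has links `a` and `b` in her feed, but because her feed covers most of the link space, each individual match is weaker evidence than it is for `alice`. The single off-feed visit to `f` shows how a likelihood-based score can tolerate noisy observations.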
Related resources
De-anonymizing social networks
The problem of de-anonymizing social networks is to identify the same users across two anonymized social networks [7] (Figure 1). The network de-anonymization task is significant in several ways, with user profile enrichment as one of its most promising applications. After de-anonymization and alignment, we can aggregate and enrich user profile information from different online networking service...
Prediction Promotes Privacy in Dynamic Social Networks
Recent work on anonymizing online social networks (OSNs) has looked at privacy preserving techniques for publishing a single instance of the network. However, OSNs evolve and a single instance is inadequate for analyzing their evolution or performing longitudinal data analysis. We study the problem of repeatedly publishing OSN data as the network evolves while preserving privacy of users. Publi...
A Passive External Web Surveillance Technique for Private Networks
The variety and richness of what users browse on the Internet has made the communications of web-browsing hosts an attractive target for surveillance. We show that passive external surveillance of web-browsing hosts in private networks is possible despite the anonymizing effects of NATs and HTTP proxies at the gateway. These devices effectively anonymize the origin of communication streams, and ...
Tracing Misbehaving Users by Utilizing Ticket-Based Protocols by Trusted Third Party in Anonymizing Networks
Anonymizing networks provide network services to users without revealing a specific identity. Network administrators cannot identify user actions in anonymizing networks. Anonymizing networks such as The Onion Routing network (Tor) use layered encrypted messages and a series of routers, each with a key to decrypt and forward the message, which hides the client's IP address from the server. The l...
An Iterative Algorithm for Graph De-anonymization
The availability of social network data is indispensable for numerous types of research. Nevertheless, data owners are often reluctant to release social network data, as the release may reveal the private information of the individuals involved in the data. To address this problem, several techniques have been proposed to anonymize social networks for privacy preserving publications. To evaluat...